Abstract:

We prove the existence of optimal transport maps for the Monge problem when the cost is a Finsler
distance on a compact manifold. Our point of view consists in considering the distance as a Mañé
potential, and in relying on recent developments in the theory of viscosity solutions of the
Hamilton-Jacobi equation.

Résumé:

We show the existence of an optimal transport map for the Monge problem when the cost is a Finsler
distance on a compact manifold. The new point of view consists in considering the distance as a
Mañé potential, and in exploiting recent developments on viscosity solutions of the
Hamilton-Jacobi equation.
The Monge transportation problem is to move one distribution of mass onto another in an
optimal way. Before we discuss this problem, let us describe a precise setting. We fix a space
$M$, which in the present paper will be a manifold, and a cost function $c \in C(M \times M, \mathbb{R})$.
Given two probability measures $\mu_0$ and $\mu_1$, we call transport map a Borel map $F : M \to M$
that transports $\mu_0$ onto $\mu_1$. An optimal transport map is a transport map $F$ that minimizes
the total cost

$$\int_M c(x, F(x))\, d\mu_0$$

among all transport maps. In many situations, optimal transport maps have remarkable
geometric properties, at least at a formal level. Some of these properties were investigated by
Monge at the end of the eighteenth century.
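As a toy illustration of the minimization just described, the following sketch solves a discrete Monge problem by brute force. It is only an assumption-laden model: the measures are uniform on finitely many points of the real line (so they are not absolutely continuous), the cost is $|x - y|$, and the names `total_cost` and `optimal_maps` are hypothetical helpers, not objects from the paper.

```python
from itertools import permutations

def total_cost(xs, ys, perm, cost):
    """Total cost of the map x_i -> y_perm[i] for the uniform measure on xs."""
    return sum(cost(x, ys[j]) for x, j in zip(xs, perm)) / len(xs)

def optimal_maps(xs, ys, cost, tol=1e-12):
    """Enumerate all bijections and return (minimal cost, list of minimizers)."""
    costs = {perm: total_cost(xs, ys, perm, cost)
             for perm in permutations(range(len(xs)))}
    best = min(costs.values())
    return best, [perm for perm, value in costs.items() if value <= best + tol]

# Hypothetical toy data: mu_0 uniform on {0, 1}, mu_1 uniform on {1, 2}.
xs, ys = [0.0, 1.0], [1.0, 2.0]
best, maps = optimal_maps(xs, ys, lambda x, y: abs(x - y))
print(best, maps)
```

In this example both bijections attain the same total cost, which illustrates why optimal maps for a distance cost need not be unique.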

The question of existence of optimal transport maps was discussed much later in the literature.
Some major steps were made by Kantorovich in 1942. He introduced both a relaxed
problem and a dual problem that opened new approaches to the existence problem. When the
cost is the square of the distance on a Euclidean vector space, Brenier proved the existence of
an optimal transport map in [6] and also provided an interesting geometric description of the
optimal maps, which have to be the gradient of a convex function. The argument was simplified,
taking advantage of the Kantorovich dual problem, by Gangbo [15], and extended in many
directions by Gangbo and McCann [16] and other authors, see our paper [4] for more details.

The case where the cost function is the distance on a Euclidean vector space is very natural,
but more difficult. Sudakov announced a proof of the existence of an optimal map in 1979, but
a gap was recently found in this proof. The strategy was to decompose the space into pieces of
smaller dimension on which transport maps can be more easily built, and to glue these maps
together. An essential hypothesis of the result is that the measure $\mu_0$ is absolutely continuous.
In the construction, it is necessary to control how this hypothesis behaves under decomposition
into the subsets. Sudakov was not aware of these difficulties, and made wrong statements at that
point, as was discovered only much later. It is interesting to notice for comparison that similar
difficulties had been faced and solved ten years earlier by Anosov in his ergodic theory
of hyperbolic diffeomorphisms.

Correct proofs in the spirit of the work of Sudakov were written simultaneously by Caffarelli,
Feldman and McCann in [7], Trudinger and Wang in [26] and slightly afterwards by Ambrosio
[1]. These authors manage to build a decomposition of the space into line segments that have
to be preserved by transport maps, called transport rays. They prove that the directions of
these rays vary Lipschitz continuously. This regularity implies that the absolutely continuous
measure $\mu_0$ has absolutely continuous decompositions on these rays. See [1] for a remarkably
written discussion of these works and of Sudakov's mistake. Before Sudakov's proof was
completed in these papers, Evans and Gangbo had provided a different proof under more stringent
hypotheses in [11]. This proof is long and complicated, but it now appears as the first proof
of the existence of a transport map in the case where the cost is a distance.

The methods inspired by Sudakov seem to allow many kinds of generalizations. The paper
[7] treats all norms whose unit ball is smooth and strictly convex. It is worth mentioning that
flat parts in the unit ball represent a major difficulty. Important progress has recently been
made in [3], which studies norms whose unit ball is a polyhedron. In another direction, Feldman
and McCann [14] have treated the case where the cost is the distance on a Riemannian manifold.
In this generalized setting, transport rays are no longer line segments, but pieces of geodesics.
It is this direction that we pursue in the present text.

Our goal is to prove the existence of transport maps for Finsler distances on manifolds
(possibly non-symmetric distances). In order to avoid superficial (and less superficial) additional
technicalities, we shall work on a compact manifold $M$. Our first novelty is a new approach to
the geometric part of the proof, that is the decomposition into transport rays. We believe that
this new approach is interesting because, beyond being more general, it sheds light on new links
between the Monge problem and the general theory of Hamilton-Jacobi equations as presented
by Fathi in [12]. In fact, all the relevant properties of the decomposition into transport rays are
obtained by straightforward applications of results of [12]. In order to finish the proof, we rely
on a secondary variational principle, along the lines of [2] and [3]. Our treatment of this secondary
principle is quite different from these papers, and it is, we believe, shorter and clearer than the
methods previously used in the literature.

This paper was born during the visit of the first author to the Bernoulli center in EPFL, 
Lausanne, in Summer 2003. We wish to thank this institution for its support. 

1 Introduction 

We state two versions of our main result, and prove the equivalence between these two state- 
ments. 

1.1 Optimal transport maps for Finsler distances 

In the present paper, the space $M$ is a smooth compact connected manifold without boundary.
We equip the manifold $M$ with a $C^2$ Finsler metric, that is the data, for each $x \in M$, of a
non-negative convex function $v \mapsto \|v\|_x$ on $T_xM$ such that

• $\|\lambda v\|_x = \lambda \|v\|_x$ for all $\lambda > 0$ and all $v \in T_xM$ (positive 1-homogeneity),

• the function $(x,v) \mapsto \|v\|_x$ is $C^2$ outside of the zero section,

• for each $x \in M$, the function $v \mapsto \|v\|_x^2$ has positive definite Hessian at all vectors $v \neq 0$.

As a consequence, $\|v\|_x = 0$ exactly when $v = 0$ (positivity). Note that $\|\cdot\|_x$ is not assumed
symmetric (that is, $\|-v\|_x \neq \|v\|_x$ is allowed). In standard terminology, the function
$v \mapsto \|v\|_x$ is a Minkowski metric on the vector space $T_xM$. See [5] for more on Finsler metrics
(in particular Theorem 1.2.2 and paragraph 6.2). We define the length of each smooth curve
$\gamma : [0, T] \to M$ by the expression

$$\ell(\gamma) = \int_0^T \|\dot\gamma(t)\|_{\gamma(t)}\, dt.$$

The Finsler distance $c$ is then given by the expression

$$c(x,y) = \inf_{\gamma} \ell(\gamma)$$

where the infimum is taken on the set of smooth curves $\gamma : [0, T] \to M$ (where $T$ is any positive
number) which satisfy $\gamma(0) = x$ and $\gamma(T) = y$. Note that the value of the infimum would not
be changed by imposing the additional requirement that $\|\dot\gamma(t)\|_{\gamma(t)} \equiv 1$. The Finsler distance
$c(x, y)$ is not necessarily symmetric, and thus is not, properly speaking, a distance. It does satisfy
the triangle inequality, and $c(x, y) = 0$ if and only if $x = y$.

We shall consider the Monge transportation problem for the cost $c$. Given a Borel measure
$\mu_0$ on $M$, and a Borel map $F : M \to M$, we define the image measure $F_\sharp\mu_0$ by

$$F_\sharp\mu_0(A) := \mu_0(F^{-1}(A))$$

for each Borel set $A \subset M$. The map $F$ is said to transport $\mu_0$ onto $\mu_1$ if $F_\sharp\mu_0 = \mu_1$. We will
present a short proof of:
Theorem 1. Let $c(x, y)$ be a Finsler distance on $M$. Let $\mu_0$ and $\mu_1$ be two probability measures
on $M$, such that $\mu_0$ is absolutely continuous with respect to the Lebesgue class. Then there exists
a Borel map $F : M \to M$ such that $F_\sharp\mu_0 = \mu_1$, and such that the inequality

$$\int_M c(x, F(x))\, d\mu_0 \leq \int_M c(x, G(x))\, d\mu_0$$

holds for each Borel map $G : M \to M$ satisfying $G_\sharp\mu_0 = \mu_1$. In other words, there exists an
optimal map for the Monge transportation problem.
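The image measure $F_\sharp\mu_0(A) = \mu_0(F^{-1}(A))$ used in this statement is easy to compute for a finitely supported measure. The sketch below is a minimal illustration under that assumption: measures are represented as dictionaries from points to masses, and `push_forward` and `transports` are hypothetical helper names.

```python
from collections import defaultdict

def push_forward(mu, F):
    """Image measure F_# mu of a finitely supported measure (dict point -> mass)."""
    image = defaultdict(float)
    for x, mass in mu.items():
        image[F(x)] += mass  # the mass of x is carried to F(x)
    return dict(image)

def transports(mu0, F, mu1, tol=1e-12):
    """Check whether F transports mu0 onto mu1, i.e. F_# mu0 = mu1."""
    image = push_forward(mu0, F)
    points = set(image) | set(mu1)
    return all(abs(image.get(p, 0.0) - mu1.get(p, 0.0)) <= tol for p in points)

# Hypothetical example on the integers: F(x) = x mod 2.
mu0 = {0: 0.25, 1: 0.25, 2: 0.5}
mu1 = push_forward(mu0, lambda x: x % 2)
print(mu1)
```

Here the points 0 and 2 are both sent to 0, so their masses add up in the image measure.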



1.2 Optimal transport maps for supercritical Mañé potentials

We shall now present a generalization of Theorem 1, which is the natural setting for our proof.
A function $L : TM \to \mathbb{R}$ is called a Tonelli Lagrangian if it is $C^2$ and satisfies:

Convexity: For each $x \in M$, the function $v \mapsto L(x,v)$ is convex with positive definite Hessian
at each point.

Superlinearity: For each $x \in M$, we have $L(x, v)/\|v\|_x \to \infty$ as $\|v\|_x \to \infty$.

Given a Tonelli Lagrangian $L$ and a time $T \in ]0, \infty)$, we define the cost function

$$c^L_T(x,y) = \min_{\gamma} \int_0^T L(\gamma(t), \dot\gamma(t))\, dt$$

where the minimum is taken on the set of curves $\gamma \in C^2([0,T], M)$ satisfying $\gamma(0) = x$ and
$\gamma(T) = y$. That this minimum exists is standard, see [21] or [12]. The function

$$c^L(x,y) := \inf_{T \in ]0,\infty)} c^L_T(x,y)$$

is called the Mañé potential of $L$. It was introduced and studied by Ricardo Mañé and then his
students in [20], [8]. Without additional hypotheses, the Mañé potential may be identically $-\infty$.
So we assume in addition:

Supercriticality: For each $(x,y) \in M^2$ with $x \neq y$, we have $c^L(x,y) + c^L(y,x) > 0$.

The following result of Mañé [20], [8] makes this hypothesis natural.

Proposition 1. Let $L \in C^2(TM, \mathbb{R})$ be a Tonelli Lagrangian. For $k \in \mathbb{R}$, let $c^{L+k}$ be the Mañé
potential associated to the Lagrangian $L + k$. There exists a constant $k_0$ such that

• For $k < k_0$, we have $c^{L+k} \equiv -\infty$ and the Lagrangian $L + k$ is called subcritical.

• For $k \geq k_0$, the Mañé potential $c^{L+k}$ is a Lipschitz function on $M \times M$ that satisfies the
triangle inequality

$$c^{L+k}(x, z) \leq c^{L+k}(x, y) + c^{L+k}(y, z)$$

for all $x$, $y$ and $z$ in $M$. In addition, we have $c^{L+k}(x,x) = 0$ for all $x \in M$.

• For $k > k_0$, the Lagrangian $L + k$ is supercritical, which means that
$c^{L+k}(x, y) + c^{L+k}(y, x) > 0$ for $x \neq y$ in $M$.
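The three regimes of Proposition 1 can be checked by hand in a simple model case. The computation below is only an illustrative sketch: it assumes that $M$ is the flat torus $\mathbb{R}^d/\mathbb{Z}^d$ with its Euclidean distance $d(x,y)$ and that $L(x,v) = |v|^2/2$, so that fixed-time minimizers are straight segments run at constant speed. Neither assumption comes from the text.

```latex
% Assumed model: flat torus, L(x,v) = |v|^2/2, d = Euclidean distance on the torus.
\begin{align*}
c^{L+k}_T(x,y) &= \frac{d(x,y)^2}{2T} + kT ,\\
k < 0: \quad & c^{L+k}(x,y) = \inf_{T>0}\Bigl(\frac{d(x,y)^2}{2T} + kT\Bigr) = -\infty
   \quad (T \to \infty),\\
k = 0: \quad & c^{L}(x,y) = \inf_{T>0} \frac{d(x,y)^2}{2T} = 0 ,\\
k > 0: \quad & c^{L+k}(x,y) = \sqrt{2k}\, d(x,y),
   \quad \text{attained at } T = \frac{d(x,y)}{\sqrt{2k}} \text{ when } x \neq y .
\end{align*}
```

In this model the critical value is $k_0 = 0$: the potential vanishes identically at $k = 0$ (so it satisfies the triangle inequality without being supercritical), while for $k > 0$ it is a positive multiple of the distance and $c^{L+k}(x,y) + c^{L+k}(y,x) = 2\sqrt{2k}\, d(x,y) > 0$ for $x \neq y$.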

We will explain that the following theorem is equivalent to Theorem 1. 

Theorem 2. Let $c^L(x,y)$ be the Mañé potential associated to a supercritical Tonelli Lagrangian
$L$. Let $\mu_0$ and $\mu_1$ be two probability measures on $M$, such that $\mu_0$ is absolutely continuous
with respect to the Lebesgue class. Then there exists an optimal transport map for the Monge
transportation problem with cost $c^L(x,y)$.
1.3 Supercritical Mañé potentials and Finsler distances

We prove the equivalence between Theorem 1 and Theorem 2. To each Tonelli Lagrangian $L$,
we associate the Hamiltonian $H \in C^2(T^*M, \mathbb{R})$ defined by

$$H(x,p) = \max_{v \in T_xM} \bigl( p(v) - L(x,v) \bigr)$$

and the energy function $E \in C^2(TM, \mathbb{R})$ defined by

$$E(x,v) = \partial_v L(x,v) \cdot v - L(x,v).$$

The function $H$ is also convex and superlinear. The mapping $\partial_v L : TM \to T^*M$ is a $C^1$
diffeomorphism, whose inverse is the mapping $\partial_p H$. We have $E = H \circ \partial_v L$.

Lemma 2. Let $L$ be a supercritical Tonelli Lagrangian. There exists a constant $K$ such that,
for each $x \neq y$ in $M$, there exists a time $T \in ]0, K]$ and a minimizing extremal $\gamma \in C^2([0, T], M)$
such that $\gamma(0) = x$, $\gamma(T) = y$, and $\int_0^T L(\gamma(t), \dot\gamma(t))\, dt = c^L_T(x,y) = c^L(x,y)$. Moreover, if $\gamma$ is
such a curve, then

$$E(\gamma(t), \dot\gamma(t)) \equiv 0.$$

Proof. We shall prove, and use, this lemma only in the case where $L$ is positive. Note first
that the function $(x,y,T) \mapsto c^L_T(x,y)$ is continuous on $M \times M \times ]0, \infty)$. It is not hard to see,
in view of the superlinearity of $L$, that the function $T \mapsto c^L_T(x,y)$ goes to infinity as $T$ goes
to zero if $x \neq y$. On the other hand, setting $\delta = \inf L > 0$, we obviously have the lower bound
$c^L_T \geq \delta T$. Since $c^L_1$ is bounded, this implies the existence of a constant $K$ such that $c^L_T > c^L_1$ for
$T \geq K$. As a consequence, the function $T \mapsto c^L_T(x,y)$ reaches its minimum on $]0, K]$ for each
$x \neq y$.

Let $x \neq y$ be two points of $M$. There exists a $T \in ]0, K]$ such that $c^L_T(x,y) = c^L(x,y)$.
Now by standard results of the calculus of variations, there exists a $C^2$ curve $\gamma : [0, T] \to M$
satisfying $\gamma(0) = x$, $\gamma(T) = y$ and $\int_0^T L(\gamma(t), \dot\gamma(t))\, dt = c^L(x,y)$. In addition, this curve satisfies
the Euler-Lagrange equations, and in particular the energy $E(\gamma(t), \dot\gamma(t))$ is constant on $[0, T]$.

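Let $\gamma_\lambda : [0, \lambda T] \to M$ be defined by $\gamma_\lambda(t) = \gamma(t/\lambda)$. The function

$$f(\lambda) := \int_0^{\lambda T} L(\gamma_\lambda, \dot\gamma_\lambda)\, dt = \lambda \int_0^T L(\gamma, \lambda^{-1}\dot\gamma)\, dt$$

clearly has to reach its minimum at $\lambda = 1$. On the other hand, a classical computation shows
that the function $f$ is differentiable, and that $f'(1) = -\int_0^T E(\gamma(t), \dot\gamma(t))\, dt$. This proves that
$E(\gamma(t), \dot\gamma(t)) \equiv 0$. □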
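The classical computation invoked at the end of this proof can be spelled out as follows. This is a sketch using only the definitions of $f$ and of the energy $E(x,v) = \partial_v L(x,v)\cdot v - L(x,v)$ given above, with differentiation under the integral sign justified by the smoothness of $L$ and $\gamma$.

```latex
\begin{align*}
f(\lambda) &= \lambda \int_0^T L\bigl(\gamma(t), \lambda^{-1}\dot\gamma(t)\bigr)\,dt,\\
f'(\lambda) &= \int_0^T L\bigl(\gamma, \lambda^{-1}\dot\gamma\bigr)\,dt
  - \lambda^{-1}\int_0^T \partial_v L\bigl(\gamma, \lambda^{-1}\dot\gamma\bigr)\cdot\dot\gamma\,dt,\\
f'(1) &= \int_0^T \bigl(L(\gamma,\dot\gamma) - \partial_v L(\gamma,\dot\gamma)\cdot\dot\gamma\bigr)\,dt
  = -\int_0^T E(\gamma(t),\dot\gamma(t))\,dt .
\end{align*}
```

Since $f(\lambda) \geq c^L_{\lambda T}(x,y) \geq c^L(x,y) = f(1)$ for every $\lambda > 0$, the derivative $f'(1)$ vanishes. The energy being constant along the extremal, its integral over $[0,T]$ equals $T$ times this constant, which must therefore be zero.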

The following proposition implies that the transportation problem for Finsler distances, the
transportation problem for supercritical Mañé potentials, and the transportation problem for
the Mañé potentials of positive Tonelli Lagrangians are equivalent problems.

Proposition 3. If $L$ is a supercritical Tonelli Lagrangian, then there exists a Finsler distance
$c$ (associated to a $C^2$ Finsler metric) and a smooth function $f : M \to \mathbb{R}$ such that

$$c(x, y) = c^L(x, y) + f(y) - f(x).$$

Conversely, given a Finsler distance $c$ (associated to a $C^2$ Finsler metric) there exists a positive
Tonelli Lagrangian $L$ such that

$$c(x,y) = c^L(x,y).$$
Proof. The first part of this proposition is the content of [17]. For the converse, we consider
the Lagrangian

$$\tilde L(x,v) = \frac{1 + \|v\|_x^2}{2}.$$

Note that the associated energy function is

$$\tilde E(x,v) = \frac{\|v\|_x^2 - 1}{2}.$$
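The energy of the model Lagrangian $(1 + \|v\|_x^2)/2$ can be verified directly. The short derivation below is a sketch that writes this Lagrangian as $\tilde L$ and its energy as $\tilde E$ (the tilde notation is an assumption made here for readability) and uses Euler's identity for the positively 2-homogeneous function $v \mapsto \|v\|_x^2$.

```latex
\begin{align*}
\partial_v \bigl(\|v\|_x^2\bigr)\cdot v &= 2\,\|v\|_x^2
  && \text{(Euler's identity, degree 2)},\\
\partial_v \tilde L(x,v)\cdot v &= \tfrac12\,\partial_v\bigl(\|v\|_x^2\bigr)\cdot v = \|v\|_x^2,\\
\tilde E(x,v) &= \partial_v \tilde L(x,v)\cdot v - \tilde L(x,v)
  = \|v\|_x^2 - \frac{1+\|v\|_x^2}{2} = \frac{\|v\|_x^2 - 1}{2}.
\end{align*}
```

In particular the energy vanishes exactly on unit vectors, and on those vectors $\tilde L(x,v) = 1 = \|v\|_x$, so the action of a unit-speed curve coincides with its Finsler length.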

Let us now consider a positive Tonelli Lagrangian $L$ such that $0 < L \leq \tilde L$ and such that $L = \tilde L$
on the set $\{\|v\|_x \geq 1/2\} \subset TM$. In order to see that such a Lagrangian exists, consider a smooth
convex function $f : [0, \infty) \to [0, \infty)$ that vanishes in a small neighborhood of $0$ and such that
$f(s) \leq (1 + s^2)/2$ with equality for $s \geq 1/2$; and consider a smooth function $g : TM \to \mathbb{R}$
such that $g$ is positive and $\partial_v^2 g(x, v)$ is positive definite where $f(\|v\|_x)$ vanishes, and such that $g$ is
zero on $\{\|v\|_x \geq 1/2\} \subset TM$. It is easy to check that the Lagrangian $L(x, v) = f(\|v\|_x) + \epsilon g(x, v)$
satisfies the desired requirements when $\epsilon > 0$ is small enough.

Since $L \leq \tilde L$, we have $c^L \leq c^{\tilde L}$. Moreover, for $x \neq y$, let $\gamma : [0, T] \to M$ be the optimal (for
$c^L$) trajectory obtained by Lemma 2. We have $\tilde E(\gamma(t), \dot\gamma(t)) = E(\gamma(t), \dot\gamma(t)) = 0$, hence $\|\dot\gamma\| \equiv 1$.
As a consequence,

$$c^L(x,y) = \int_0^T L(\gamma(t), \dot\gamma(t))\, dt = \int_0^T \tilde L(\gamma(t), \dot\gamma(t))\, dt = \ell(\gamma) \geq c(x,y).$$

We have proved that $c \leq c^L \leq c^{\tilde L}$. These are equalities because $c^{\tilde L} \leq c$. Indeed, for all
smooth curves $\gamma : [0, T] \to M$ which satisfy $\gamma(0) = x$, $\gamma(T) = y$ and $\|\dot\gamma(t)\|_{\gamma(t)} \equiv 1$ with
$T = \ell(\gamma) > 0$, we get $c^{\tilde L}(x,y) \leq \int_0^T \tilde L(\gamma(t), \dot\gamma(t))\, dt = \ell(\gamma)$. Since $c(x,y)$ is the infimum of the
lengths of such curves, we have $c^{\tilde L}(x,y) \leq c(x,y)$. □



1.4 General convention 

In the sequel, we will prove Theorem 2 for a positive Lagrangian $L$ and denote $c^L$ simply by $c$.
In view of Proposition 3, this implies the general form of Theorem 2, as well as Theorem 1. We
fix, once and for all, a positive Tonelli Lagrangian $L$, and a positive number $\delta > 0$ such that

$$L(x, v) \geq \delta$$

for each $(x, v) \in TM$.

The general scheme of our proof is somewhat similar to the one introduced by Sudakov,
and followed in [7], [26], [1], [14], [2] and other papers. Like these papers, our proof involves
a decomposition of the space $M$ into distinguished curves, called transport rays. We introduce
these rays in section 3 and describe their geometric properties. In this geometric part of the
proof, our point of view is quite different from the literature as we emphasize the link with
the theory of viscosity solutions as developed in [12], and manage to obtain all the relevant
properties of transport rays as a straightforward application of general results of [12]. For
the second part of the proof, all the papers mentioned above involve subtle decompositions
of measures on these transport rays. It is at this step that the paper of Sudakov contains a
gap. We simplify this step by introducing a secondary variational principle in section 2. Note
that secondary variational principles have already been introduced by Ambrosio, Kirchheim and
Pratelli in [3] for related problems. This secondary problem is studied in section 5 by a quite
simple method, which, surprisingly, seems new. This method allows a neat clarification of the
end of the proof compared to the existing literature. All the difficulties involving measurability
issues and absolute continuity of disintegrated measures are reduced to a single and simple
Fubini-like theorem, presented in section 4.
2 Transport plans



We introduce our secondary variational principle and recall the necessary generalities on the 
Monge problem. Beyond the references provided below, the pedagogical texts [U [MJ [57J may 
help the reader who wants more details. We define the quantity 

$$C(\mu_0, \mu_1) := \inf_F \int_M c(x, F(x))\, d\mu_0,$$

where the infimum is taken on the set of Borel maps $F : M \to M$ that transport $\mu_0$ onto $\mu_1$. It
is useful, following Kantorovich, to relax this infimum to a nicer minimization problem. A Borel
measure $\mu$ on $M \times M$ is called a transport plan if it satisfies the equalities $(\pi_0)_\sharp\mu = \mu_0$ and
$(\pi_1)_\sharp\mu = \mu_1$, where $\pi_0 : M \times M \to M$ is the projection on the first factor, and
$\pi_1 : M \times M \to M$ is the projection on the second factor. Clearly, any transport map $F$ can be
considered as the transport plan

$$(\mathrm{Id} \times F)_\sharp \mu_0.$$

Following Kantorovich, we consider the minimum

$$K(\mu_0, \mu_1) = \min_{\mu} \int_{M \times M} c\, d\mu$$

taken on the set of transport plans. It is well-known and easy to prove that this minimum exists.
The equality

$$K(\mu_0, \mu_1) = C(\mu_0, \mu_1)$$

holds if $\mu_0$ has no atom, see [1], Theorem 2.1. Note that this equality is very general, see [23] for
a discussion. As a consequence, if $\mu_0$ has no atom, it is equivalent to prove the existence of an
optimal transport map and to prove that there exists an optimal transport plan concentrated
on the graph of a Borel function.

Let us define the second cost function

$$\sigma(x,y) = (c(x,y))^2.$$

This cost is chosen in order that the following refined form of Theorem 1 holds.

Theorem 3. Let $\mathcal{O}$ be the set of optimal transport plans for $K(\mu_0, \mu_1)$ with the cost $c$. The
minimum

$$\min_{\mu \in \mathcal{O}} \int_{M \times M} \sigma\, d\mu$$

exists. In addition, if $\mu_0$ is absolutely continuous, then there is one and only one transport plan
$\mu$ realizing this optimum, and this transport plan is concentrated on the graph of a Borel function
that is an optimal transport map for the cost $c$.

This result will be proved in section 5. The idea of introducing a secondary variational problem
as in this statement has already been used by Ambrosio, Kirchheim, and Pratelli, see [3] and
also [2]. Our treatment in section 5 is inspired by these references, although it is somewhat
different. It allows substantial simplifications compared to the literature.
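The role of the secondary cost can be seen on a discrete toy problem. The sketch below is only an illustration under stated assumptions: the measures are uniform on finitely many points of the real line, the primary cost is $|x - y|$, plans are restricted to bijections, and `select_secondary` is a hypothetical helper name. Among the maps that are optimal for the primary cost, it keeps those minimizing the squared cost.

```python
from itertools import permutations

def average(cost, xs, ys, perm):
    """Average of cost(x_i, y_perm[i]) for the uniform measure on xs."""
    return sum(cost(x, ys[j]) for x, j in zip(xs, perm)) / len(xs)

def select_secondary(xs, ys, c, tol=1e-12):
    """Return (maps optimal for c, maps among them minimizing sigma = c**2)."""
    perms = list(permutations(range(len(xs))))
    primary = min(average(c, xs, ys, p) for p in perms)
    optimal = [p for p in perms if average(c, xs, ys, p) <= primary + tol]
    sigma = lambda x, y: c(x, y) ** 2  # secondary cost
    secondary = min(average(sigma, xs, ys, p) for p in optimal)
    selected = [p for p in optimal if average(sigma, xs, ys, p) <= secondary + tol]
    return optimal, selected

# Hypothetical toy data: mu_0 uniform on {0, 1}, mu_1 uniform on {1, 2}.
xs, ys = [0.0, 1.0], [1.0, 2.0]
optimal, selected = select_secondary(xs, ys, lambda x, y: abs(x - y))
print(optimal, selected)
```

In this example two maps are optimal for the distance cost, and the squared cost singles out the monotone one, which mirrors the uniqueness assertion of Theorem 3 in a very simplified setting.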

3 Kantorovich potential and calibrated curves 

We present the decomposition into transport rays, which is the standard initial step in the
construction of optimal maps. This construction is based on well-understood general results on
viscosity sub-solutions of the Hamilton-Jacobi equation, as presented in [12]. Making this
connection is one of the novelties of the present paper.

Since the cost function we consider satisfies the triangle inequality

$$c(x,z) \leq c(x,y) + c(y,z)$$

for all $x$, $y$ and $z$ in $M$, as well as the identity $c(x, x) = 0$ for all $x \in M$, we can take advantage
of the following general duality result, inspired by Kantorovich, see for example [27], [10] and

Proposition 4. Given two measures $\mu_0$ and $\mu_1$, there exists a function $u \in C(M, \mathbb{R})$ that
satisfies

$$u(y) - u(x) \leq c(x,y)$$

for all $x$ and $y$ in $M$, and

$$K(\mu_0, \mu_1) = \int_M u\, d(\mu_1 - \mu_0).$$

In addition, for each optimal transport plan $\mu$, the equality $u(y) - u(x) = c(x, y)$ holds for
$\mu$-almost every $(x,y) \in M^2$. The function $u$ is called a Kantorovich potential.
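The two properties of a Kantorovich potential can be tested numerically on a discrete example. The sketch below makes several assumptions that are not in the text: the space is a finite subset of the real line, the cost is $|x - y|$, the measures are uniform, and the candidate potential is $u(x) = x$. The function names are hypothetical.

```python
def is_subsolution(u, points, c, tol=1e-12):
    """Check u(y) - u(x) <= c(x, y) for all pairs of points."""
    return all(u(y) - u(x) <= c(x, y) + tol for x in points for y in points)

def dual_value(u, xs, ys):
    """Integral of u against mu_1 - mu_0 for uniform measures on ys and xs."""
    return sum(u(y) for y in ys) / len(ys) - sum(u(x) for x in xs) / len(xs)

def primal_value(xs, ys, c):
    """Cost of the map x_i -> y_i for the uniform measure on xs."""
    return sum(c(x, y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical toy data: mu_0 uniform on {0, 1}, mu_1 uniform on {1, 2}.
xs, ys = [0.0, 1.0], [1.0, 2.0]
c = lambda x, y: abs(x - y)
u = lambda x: x  # candidate potential (an assumption of this example)

admissible = is_subsolution(u, xs + ys, c)
gap = primal_value(xs, ys, c) - dual_value(u, xs, ys)
print(admissible, gap)
```

A vanishing gap between the cost of a transport map and the dual value certifies at once that the map is optimal and that $u$ is a Kantorovich potential for this toy problem.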

The present paper was born from the observation that the Kantorovich potentials are viscosity
subsolutions of the Hamilton-Jacobi equation as studied in [12]. In order to explain this
connection, it is necessary to define the Hamiltonian function $H \in C^2(T^*M, \mathbb{R})$ by

$$H(x,p) = \max_{v \in T_xM} \bigl( p(v) - L(x,v) \bigr).$$

Note that the mapping $\partial_v L : TM \to T^*M$ is a $C^1$ diffeomorphism, whose inverse is the
mapping $\partial_p H$. It is proved in [12] that the following properties are equivalent for a function
$w \in C(M, \mathbb{R})$.

1. The function $w$ satisfies the inequality $w(y) - w(x) \leq c(x,y)$ for all $x$ and $y$ in $M$.

2. The function $w$ is a viscosity sub-solution of the Hamilton-Jacobi equation $H(x, dw) = 0$,
i.e. each smooth function $f : M \to \mathbb{R}$ satisfies the inequality $H(x, df(x)) \leq 0$ at each
point of minimum $x$ of the difference $f - w$.

3. The function $w$ is Lipschitz and satisfies the inequality $H(x, dw_x) \leq 0$ at almost every
point. This inequality then holds at every point of differentiability $x$ of $w$.

4. The function $w$ is Lipschitz and, for almost every $x \in M$, it satisfies the inequality
$L(x,v) \geq dw_x(v)$ for all $v \in T_xM$. This inequality then holds at every point of
differentiability $x$ of $w$.

Although there may exist several Kantorovich potentials, we shall fix one of them, u, for the 
sequel. 

Definition 5. Following Fathi [12], we call calibrated curve a continuous and piecewise $C^1$ curve
$\gamma : I \to M$ that satisfies

$$u(\gamma(t)) - u(\gamma(s)) = \int_s^t L(\gamma(\tau), \dot\gamma(\tau))\, d\tau = c(\gamma(s), \gamma(t)) \qquad (1)$$

whenever $s \leq t$ in $I$, where $I$ is a non-empty interval of $\mathbb{R}$ (possibly a point). A calibrated curve
$\gamma : I \to M$ is called non-trivial if the interval $I$ has non-empty interior.
Note that the first of the equalities in (1) implies the second, since the inequalities

$$u(\gamma(t)) - u(\gamma(s)) \leq c(\gamma(s), \gamma(t)) \leq \int_s^t L(\gamma(\tau), \dot\gamma(\tau))\, d\tau$$

hold for any curve $\gamma$. It is obvious that non-trivial calibrated curves are minimizing extremals
of $L$, and as a consequence they are $C^2$ curves. In addition, the concatenation of two calibrated
curves is calibrated, so that each calibrated curve can be extended to a maximal calibrated
curve, that is, its domain $I$ cannot be further extended without losing calibration. Note that
$I$ is closed when $\gamma$ is a maximal calibrated curve.

Definition 6. We call transport ray the image of a non-trivial maximal calibrated curve. 

It will be useful to consider, following [7] and [14], the functions $\alpha$ and $\beta : M \to [0, \infty)$
defined as follows:

• $\alpha(x)$ is the supremum of all times $T \geq 0$ such that there exists a calibrated curve
$\gamma : [-T, 0] \to M$ that satisfies $\gamma(0) = x$.

• $\beta(x)$ is the supremum of all times $T \geq 0$ such that there exists a calibrated curve
$\gamma : [0, T] \to M$ that satisfies $\gamma(0) = x$ (the fact that $\alpha$ and $\beta$ are finite is a consequence of
Lemma 9 below).

Definition 7. Let us denote by $\mathcal{T}$ the subset of $M$ obtained as the union of all transport rays,
or equivalently the set of points $x \in M$ such that $\alpha(x) + \beta(x) > 0$. For $\epsilon \geq 0$, we denote by $\mathcal{T}_\epsilon$
the set of points $x \in M$ that satisfy $\alpha(x) > \epsilon$ and $\beta(x) > \epsilon$. Clearly $\mathcal{T}_\epsilon \subset \mathcal{T}$ for all $\epsilon \geq 0$. The
set $\mathcal{E} := \mathcal{T} - \mathcal{T}_0$ is the set of ray ends.

Proposition 8. The function $u$ is differentiable at each point of $\mathcal{T}_0$. For each point $x \in \mathcal{T}_0$,
there exists a single maximal calibrated curve

$$\gamma_x : [-\alpha(x), \beta(x)] \to M \quad \text{such that} \quad \gamma_x(0) = x. \qquad (2)$$

This curve satisfies the relations

$$du_x = \partial_v L(x, \dot\gamma_x(0)) \quad \text{or equivalently} \quad \dot\gamma_x(0) = \partial_p H(x, du_x).$$

For each $\epsilon > 0$, the differential $x \mapsto du_x$ is Lipschitz on $\mathcal{T}_\epsilon$, or equivalently the map $x \mapsto \dot\gamma_x(0)$
is Lipschitz on $\mathcal{T}_\epsilon$.

Proof. This proposition is Theorem 4.5.5 of Fathi's book [12]. □

Lemma 9. Let $\gamma : [a, b] \to M$ be a non-trivial calibrated curve. Then, for all $t \in ]a, b[$, the
function $u$ is differentiable at $\gamma(t)$ and

$$du_{\gamma(t)}(\dot\gamma(t)) = L(\gamma(t), \dot\gamma(t)) \geq \delta$$

(see 1.4 for the definition of $\delta$). As a consequence, the map $\gamma : [a,b] \to M$ is an embedding (it
is one-to-one and has non-zero derivative on $]a,b[$) and transport rays are non-trivial embedded
arcs.

Proof. The differentiability of $u$ at $\gamma(t)$ follows from Proposition 8, since $\gamma(t) \in \mathcal{T}_0$ for
$t \in ]a,b[$. Since $u$ is a viscosity sub-solution of the Hamilton-Jacobi equation, see the equivalence
below Proposition 4, we have $L(x,v) \geq du_x(v)$ for all $v \in T_xM$ at each point of differentiability
of $u$. As a consequence, the inequality

$$L(\gamma(t), \dot\gamma(t)) \geq du_{\gamma(t)}(\dot\gamma(t))$$

holds for each $t \in ]a, b[$. Integrating the above inequality gives

$$\int_a^b L(\gamma(t), \dot\gamma(t))\, dt \geq u(\gamma(b)) - u(\gamma(a)),$$

which is an equality because the curve $\gamma$ is calibrated. As a consequence, we have

$$du_{\gamma(t)}(\dot\gamma(t)) = L(\gamma(t), \dot\gamma(t)) \geq \delta$$

for all $t \in ]a, b[$. □

Lemma 10. The functions $\alpha$ and $\beta$ are bounded and upper semi-continuous, hence Borel
measurable. As a consequence, the sets $\mathcal{T}$ and $\mathcal{T}_\epsilon$, $\epsilon \geq 0$, are Borel.

Proof. We shall consider only the function $\alpha$. We have just seen that, for each non-trivial
calibrated curve $\gamma : I \to M$, the function $f(t) = u \circ \gamma(t)$ is differentiable and satisfies $f'(t) \geq \delta$.
Since the continuous function $u$ is bounded on the compact manifold $M$, we conclude that the
functions $\alpha$ and $\beta$ are bounded. In order to prove that the function $\alpha$ is upper semi-continuous,
let us consider a sequence $x_n \in M$ that converges to a limit $x$ and is such that $\alpha(x_n) \geq T$. We have
to prove that $\alpha(x) \geq T$. There exists a sequence $\gamma_n : [-T, 0] \to M$ of calibrated curves such
that $\gamma_n(0) = x_n$. There exists a subsequence of $\gamma_n$ that converges uniformly on $[-T, 0]$ to a
curve $\gamma : [-T, 0] \to M$. It is easy to see that the curve $\gamma$ is calibrated and satisfies $\gamma(0) = x$.
As a consequence, we have $\alpha(x) \geq T$. □



Definition 11. For $x \in M$, let us denote by $R_x$ the union of the transport rays containing $x$.
We also denote by $R_x^+$ the set of points $y \in M$ such that $u(y) - u(x) = c(x, y)$.

Note that $R_x = \gamma_x([-\alpha(x), \beta(x)])$ when $x \in \mathcal{T}_0$, where $\gamma_x$ is given in (2).

Lemma 12. We have $R_x^+ = \gamma_x([0, \beta(x)])$ when $x \in \mathcal{T}_0$, and $R_x^+ = \{x\}$ when $x \in M - \mathcal{T}$.

Proof. Let $x$ be a point of $\mathcal{T}_0$. By the calibration property of $\gamma_x$, we have, for $t \in [0, \beta(x)]$,
$u(\gamma_x(t)) - u(\gamma_x(0)) = c(\gamma_x(0), \gamma_x(t))$, which is precisely saying that $\gamma_x(t) \in R_x^+$. Conversely, let us fix a
point $x \in M$ and let $y$ be a point of $R_x^+$. There exists a time $T \geq 0$ and a curve $\gamma : [0, T] \to M$
such that $\int_0^T L(\gamma(t), \dot\gamma(t))\, dt = c(x,y)$, $\gamma(0) = x$ and $\gamma(T) = y$. Since $c(x,y) = u(y) - u(x)$, the
curve $\gamma$ is calibrated. If $x \in \mathcal{T}_0$, then $\gamma = \gamma_x|_{[0,T]}$, hence $y = \gamma(T) = \gamma_x(T) \in \gamma_x([0, \beta(x)])$. If
$x \notin \mathcal{T}$, then there is no non-trivial calibrated curve starting at $x$, so we must have $y = x$ in the
above discussion, and $R_x^+ = \{x\}$. □



Proposition 13. The transport plan $\mu$ is optimal for the cost $c$ if and only if it is concentrated
on the closed set

$$\bigcup_{x \in M} \{x\} \times R_x^+ = \{(x,y) \in M^2 : c(x,y) = u(y) - u(x)\}.$$

Proof. Recall from Proposition 4 that any optimal transport plan is concentrated on this set.
Conversely, if $\mu$ is a transport plan concentrated on this set, then

$$K(\mu_0, \mu_1) = \int_M u\, d(\mu_1 - \mu_0) = \int_{M^2} (u(y) - u(x))\, d\mu = \int_{M^2} c\, d\mu$$

and $\mu$ is thus optimal. □
4 Fubini Theorem



The geometric information on transport rays obtained in the preceding section
implies the following crucial Fubini-like result, whose proof is the goal of the present section:

Proposition 14. Let $A$ be a Borel subset of $\mathcal{T}$ such that the intersection $A \cap R$ has zero
1-Hausdorff measure for each transport ray $R$. Then the set $A$ has zero Lebesgue measure.

For comparison with the literature, see [7] Lemma 25, [26] section 4 or [14] Lemma 24, we
mention:

Corollary 15. The set $\mathcal{E} = \mathcal{T} - \mathcal{T}_0$ of ray ends has zero Lebesgue measure.

We now turn to the proof of Proposition 14, which occupies the end of this section. The
method is standard. Let $k$ be the dimension of $M$.

Definition 16. We call transport beam any given pair $(B, \chi)$, where $B$ is a bounded Borel subset
of $\mathbb{R}^k$ and $\chi : B \to M$ is a Lipschitz map (not necessarily one-to-one) such that:

• There exists a bounded Borel set $\Omega \subset \mathbb{R}^{k-1}$ and two bounded Borel functions
$a \leq b : \Omega \to \mathbb{R}$ such that

$$B = \{(\omega, s) \in \Omega \times \mathbb{R} \ \text{s.t.}\ a(\omega) \leq s \leq b(\omega)\} \subset \mathbb{R}^k = \mathbb{R}^{k-1} \times \mathbb{R}.$$

• For each $\omega \in \Omega$, the curve $\chi_\omega : [a(\omega), b(\omega)] \to M$ given by $\chi_\omega(s) = \chi(\omega, s)$ is a calibrated
curve.

Lemma 17. If $(B, \chi)$ is a transport beam, then the set $A \cap \chi(B)$ has zero Lebesgue measure.

Proof. For each $\omega \in \Omega$, the curve $\chi_\omega$ is a bilipschitz homeomorphism onto its image. Since in
addition, the set $A \cap \chi(\{\omega\} \times [a(\omega), b(\omega)])$ has zero 1-Hausdorff measure, the set $\chi^{-1}(A)$ intersects
each vertical line $\{\omega\} \times \mathbb{R}$ along a set of zero 1-Hausdorff measure. In view of the classical Fubini
theorem, the set $\chi^{-1}(A)$ has zero Lebesgue measure in $\mathbb{R}^k$. Since the $k$-Hausdorff measure on $B$
(associated to the restricted Euclidean metric) is the restriction to $B$ of the Lebesgue measure
of $\mathbb{R}^k$, the set $\chi^{-1}(A)$ has zero $k$-Hausdorff measure in $B$, for the Hausdorff measure associated
to the induced metric. Since Lipschitz maps send sets of zero $k$-Hausdorff measure onto sets of
zero $k$-Hausdorff measure, we conclude that the set $A \cap \chi(B) \subset \chi(\chi^{-1}(A))$ has zero Lebesgue
measure in $M$. □

We can now conclude the proof of Proposition 14 by the following lemma.

Lemma 18. There exists a countable family $(B_{i,j}, \chi_{i,j})$, $(i,j) \in \mathbb{N}^2$, of transport beams such that
the images $\chi_{i,j}(B_{i,j})$ cover the set $\mathcal{T}$.

Proof. Let $D$ be the closed unit ball in $\mathbb{R}^{k-1}$. Let $\psi_i : D \to M$, $i \in \mathbb{N}$, be a countable family
of smooth embeddings such that, for each maximal calibrated curve $\gamma : [a, b] \to M$, the curve
$\gamma(]a, b[)$ intersects the image of $\psi_i$ for some $i \in \mathbb{N}$. In order to build such a family of embeddings,
let us consider a finite atlas $\Theta$ of $M$ composed of charts $\theta : B_3 \to M$, where $B_r$ is the open ball
of radius $r$ centered at zero in $\mathbb{R}^k$. We assume that the finite family of open sets $\theta(B_1)$, $\theta \in \Theta$,
covers $M$. For $n = 1, \ldots, k$ and $q \in \mathbb{Q} \cap [-1,1]$, we consider the embedded disk $\mathcal{D}_{n,q} \subset B_2$
formed by points $x = (x_1, \ldots, x_k) \in B_2$ which satisfy $x_n = q$. The countable family

$$\theta(\mathcal{D}_{n,q}); \quad \theta \in \Theta; \quad n = 1, \ldots, k; \quad q \in \mathbb{Q} \cap [-1, 1]$$

of embedded disks of $M$ forms a web which intersects all non-trivial curves of $M$, hence all
transport rays. We have constructed a countable family of embedded disks which intersects all
transport rays.

For each $(i,j) \in \mathbb{N}^2$, let us consider the set $\Omega_{i,j} = D \cap \psi_i^{-1}(\mathcal{T}_{1/j})$. Let $a_{i,j}(\omega)$ and $b_{i,j}(\omega) :
\Omega_{i,j} \to \mathbb{R}$ be the functions $-\alpha \circ \psi_i$ and $\beta \circ \psi_i$. Let $B_{i,j}$ be the set of points $(\omega, s) \in \Omega_{i,j} \times \mathbb{R}$
such that $a_{i,j}(\omega) \leq s \leq b_{i,j}(\omega)$. To finish, we define the map $\chi_{i,j} : B_{i,j} \to M$ by

$$\chi_{i,j}(\omega, s) = \gamma_{\psi_i(\omega)}(s).$$

We claim that, for each $(i, j) \in \mathbb{N}^2$, the map $\chi_{i,j}$ is Lipschitz, so that the pair $(B_{i,j}, \chi_{i,j})$ is a
transport beam. In order to prove this claim, remember that there exists a vector field $E$ on
$TM$, the Euler-Lagrange vector field, such that the extremals are the projections of the integral
curves of $E$. Because of energy conservation, this vector field generates a complete flow, denoted
by $f_s : TM \to TM$ for $s \in \mathbb{R}$. From the fact that the Hamiltonian $H$ is $C^2$ and $f_s$ is in Legendre
duality with the Hamiltonian flow, we deduce that $(s,x,v) \mapsto f_s(x,v)$ is $C^1$. We have

$$\chi_{i,j}(x, s) = P_M \circ f_s\bigl(\psi_i(x), \dot\gamma_{\psi_i(x)}(0)\bigr) \quad \forall (x, s) \in B_{i,j},$$

where $P_M : TM \to M$ is the canonical projection on $M$. This map is Lipschitz in view of
Proposition 8. If $R$ is a transport ray, it is clear that $R$ is contained in one of the images $\chi_{i,j}(B_{i,j})$. □



5 The distinguished transport plan 

We shall now prove Theorem 2, and hence Theorem 1. Our approach is based on remarks in
[2] and [3]; however, it seems new, and is surprisingly simple. Let $\mu$ be a transport plan that
is optimal for the cost $c$ and, among these optimal transport plans, minimizes the functional
$\int \sigma\, d\mu$. The existence of such a plan is straightforward.

Proposition 19. There exists a set

$$\Gamma \subset \bigcup_{x \in M} \{x\} \times R_x^+, \qquad (3)$$

which is a countable union of compact sets, such that $\mu(\Gamma) = 1$ and which is monotone in
the following sense: if $(x_i, y_i)$, $i \in \{1, \ldots, k\}$, is a finite family of points of $\Gamma$ and if $j(i)$ is a
permutation such that $y_{j(i)} \in R_{x_i}^+$, then

$$\sum_{i=1}^k \sigma(x_i, y_i) \leq \sum_{i=1}^k \sigma(x_i, y_{j(i)}).$$

Proof. Let us consider the cost function $\zeta$, where $\zeta(x,y) : M \times M \to [0, \infty]$ is the lower
semi-continuous function defined by $\zeta(x,y) = \sigma(x,y)$ if $u(y) - u(x) = c(x,y)$ and $\zeta(x,y) = \infty$
if not. Note that $\int \zeta\, d\mu = \int \sigma\, d\mu$ is finite. Theorem 3.2 of [2] implies the existence of a Borel
set $\tilde\Gamma$ on which $\mu$ is concentrated, and which is monotone. By interior regularity of the Borel
measure $\mu$, there exists a set $\Gamma \subset \tilde\Gamma$ which is a countable union of compact sets and on which $\mu$
is concentrated. Being a subset of the monotone set $\tilde\Gamma$, the set $\Gamma$ is itself monotone. □

Definition 20. Let $A$ be the set of points $x \in M$ such that the set

$$\Gamma_x := \{y \in M : (x,y) \in \Gamma\}$$

contains more than one point, where $\Gamma$ is defined in (3).
Lemma 21. The set $A$ is Borel measurable.

Proof. Let $K_n$, $n \in \mathbb{N}$, be an increasing sequence of compact sets such that $\Gamma = \bigcup_n K_n$. For each
$x \in M$, let $\delta_n(x)$ be the diameter of the compact set $K_n^x$ of points $y \in M$ such that $(x, y) \in K_n$.
It is not hard to see that the function $\delta_n(x)$ is upper semi-continuous, hence Borel measurable.
Since $\Gamma_x = \bigcup_{n \in \mathbb{N}} K_n^x$, we have $\delta(x) = \sup_n \delta_n(x)$, where $\delta(x)$ is the diameter of $\Gamma_x$. As
a consequence, the function $\delta$ is Borel measurable, and the set $A = \{x \in M, \delta(x) > 0\}$ is Borel. □

Proposition 22. We have $A \subset \mathcal{T}$, and the intersection $A \cap R$ is at most countable for each
transport ray $R$.

Proof. If $x \notin \mathcal{T}$, then $R_x^+ = \{x\}$, hence $\Gamma_x \subset \{x\}$, and $x \notin A$. Let us now consider a
transport ray $R$ that is the image of a maximal calibrated curve $\gamma : [\alpha, \beta] \to M$. Let us
denote by $h : [\alpha, \beta] \to \mathbb{R}$ the function $u \circ \gamma$, which is strictly increasing (Lemma 9). Note that
$\sigma(\gamma(s), \gamma(t)) = (h(t) - h(s))^2$ for $s \leq t$ in $[\alpha, \beta]$. In view of the monotonicity of $\Gamma$, we have

$$(h(t) - h(s))^2 + (h(t') - h(s'))^2 \leq (h(t) - h(s'))^2 + (h(t') - h(s))^2$$

or equivalently

$$(h(t) - h(t'))(h(s) - h(s')) \geq 0 \qquad (4)$$

whenever $(\gamma(s), \gamma(t)) \in \Gamma$, $(\gamma(s'), \gamma(t')) \in \Gamma$, $s' \leq t$, and $s \leq t'$. Following [16] or [2], we observe
that this implies the property:

$$(\gamma(s), \gamma(t)) \in \Gamma, \quad (\gamma(s'), \gamma(t')) \in \Gamma, \quad s < s' \implies t \leq t'. \qquad (5)$$
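The equivalence between the monotonicity inequality and (4), and the passage from (4) to (5), can be made explicit. The following lines are a sketch using the shorthand $a = h(t)$, $b = h(t')$, $c = h(s)$, $d = h(s')$, which is introduced here only for the computation.

```latex
\begin{align*}
&(a-c)^2 + (b-d)^2 - (a-d)^2 - (b-c)^2
   = -2ac - 2bd + 2ad + 2bc = -2\,(a-b)(c-d),\\
&\text{so}\quad (a-c)^2 + (b-d)^2 \le (a-d)^2 + (b-c)^2
   \iff (a-b)(c-d) \ge 0 .
\end{align*}
% From (4) to (5): assume s < s' and, by contradiction, t > t'.
% Then s < s' <= t' gives s <= t', and s' <= t' < t gives s' <= t, so (4) applies.
% Since h is strictly increasing, h(t) - h(t') > 0 and h(s) - h(s') < 0,
% which contradicts (4). Hence t <= t'.
```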

This property implies that the set of values of $s$ in $[\alpha, \beta]$ such that $\Gamma_{\gamma(s)}$ contains more than
one element is at most countable. Indeed, for all integers $n \geq 1$, let $S_n$ be the set of values
of $s \in [\alpha, \beta]$ such that there exist $t_1, t_2 \in [\alpha, \beta]$ with $(\gamma(s), \gamma(t_1)) \in \Gamma$, $(\gamma(s), \gamma(t_2)) \in \Gamma$ and
$t_2 - t_1 \geq 1/n$. If $s < s'$ are in $S_n$ and if $t_1, t_2$ and $t_1', t_2'$ are as above with respect to $s$ and $s'$,
then $\alpha \leq t_1 \leq t_2 - 1/n < t_2 \leq t_1' \leq t_2' - 1/n < t_2' \leq \beta$ and thus $\beta - \alpha \geq 2/n$. More generally, if
$S_n$ contains at least $j$ points, then $\beta - \alpha \geq j/n$. As the interval $[\alpha, \beta]$ is bounded, the set $S_n$ is
finite for all $n$, which leads to the conclusion. □

Theorem 2 can now be proved in a very standard way. In view of section 4, the set $A$ has
zero Lebesgue measure in $M$. The set $Z = M - A$ is a Borel set of full Lebesgue measure, hence
$\mu_0(Z) = 1$. Denoting by $\pi_0 : M \times M \to M$ the projection on the first factor, we observe that
the set $\Gamma_Z = \Gamma \cap \pi_0^{-1}(Z)$ is a Borel graph on which $\mu$ is concentrated (because $\mu(\Gamma) = 1$). By
the easy Proposition 2.1 of [1], we conclude that the plan $\mu$ is induced from a transport map $F$.
We then have

$$\int_{M \times M} c\, d\mu = \int_M c(x, F(x))\, d\mu_0(x) = K(\mu_0, \mu_1) = C(\mu_0, \mu_1),$$

so that the map $F$ is optimal for the cost $c$. This ends the proof of Theorem 2 and Theorem 1.